A Perspective on the Lexicographic Value of Mega Newspaper Corpora — The Case of Afrikaans in South Africa
Abstract
This article assesses the potential of a mega newspaper corpus, the Media24 archive, for compiling major general dictionaries for Afrikaans in the absence of large balanced and representative corpora. First, Media24 is evaluated against the lemmalists of both a major single-volume and a multi-volume monolingual Afrikaans dictionary to determine how closely it correlates with them. Second, the suitability of Media24 for lemma selection in categories other than newspapers is evaluated. Finally, the potential contribution of Media24 to lexical sense distinction, the selection of usage examples, and the identification of typical collocations is determined.